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Experimental characterizations of a quantum system involve the measurement of expec- 
tation values of observables for a preparable state \ijj) of the quantum system. Such expec- 
tation values can be measured by repeatedly preparing \ip) and coupling the system to an 
apparatus. For this method, the precision of the measured value scales as -k= for N rep- 



etitions of the experiment. For the problem of estimating the parameter <j> in an evolution 
e -i(j>H ^ j t j s p 0SS ibie to achieve precision -k (the quantum metrology limit, see provided 
that sufficient information about H and its spectrum is available. We consider the more 
general problem of estimating expectations of operators A with minimal prior knowledge of 
A. We give explicit algorithms that approach precision given a bound on the eigenvalues 
of A or on their tail distribution. These algorithms are particularly useful for simulating 
quantum systems on quantum computers because they enable efficient measurement of ob- 
servables and correlation functions. Our algorithms are based on a method for efficiently 
measuring the complex overlap of \ip) and U\tp), where U is an implementable unitary op- 
erator. We explicitly consider the issue of confidence levels in measuring observables and 
overlaps and show that, as expected, confidence levels can be improved exponentially with 
linear overhead. We further show that the algorithms given here can typically be parallelized 
with minimal increase in resource usage. 

PACS numbers: 03.67.-a, 03.67.Mn, 03.65.Ud, 05.30-d 



Uncertainty relations such as Heisenberg's set fundamental physical limits on the achievable 
precision when we extract information from a physical system. The goal of quantum metrology 
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is to measure properties of states of quantum systems as precisely as possible given available re- 
sources. Typically, these properties are determined by experiments that involve repeated prepara- 
tion of a quantum system in a state p followed by a measurement. The property is derived from the 
measurement outcomes. Because the repetitions are statistically independent, the precision with 
which the property is obtained scales as -7=, where N is the number of preparations performed. 
This is known as the standard quantum limit or the shot-noise limit, and it is associated with a 
purely classical statistical analysis of errors. It has been shown that in many cases of interest, the 
precision can be improved to 4 by using the same resources but with initial states entangled over 
multiple instances of the quantum system, or by preserving quantum coherence from one experi- 
ment to the next. It is known that it is usually not possible to attain a precision that scales better 
than J*. (See yfl for a review of quantum-enhanced measurements.) A setting where this limit can 
be achieved is the parameter estimation problem, where the property is given by the parameter </> in 
an evolution e~ l ^ H for a known Hamiltonian i^T LI], which captures some common measurement 
problems. The standard method for determining <p requires the ability to apply e~ l ^ H and to pre- 
pare and measure an eigenstate of H with known eigenvalue. If it is not possible to prepare such 
an eigenstate or if we wish to determine expectations with respect to arbitrary states, this method 
fails. Here we are interested in the more general and physically important expectation estimation 
problem, where the property to be determined is an expectation (A) = tr(Ap) of an observable 
(Hermitian operator) or unitary A, for a possibly mixed state p. Both A and p are assumed to be 
experimentally sufficiently controllable, but other than a bound on the eigenvalues of A or their 
tail distribution, no other properties of A or p need to be known. In particular, we need not be 
able to prepare eigenstates of A or know the spectrum of A. The parameter estimation problem 
is a special instance of the expectation estimation problem. Parameter estimation reduces to the 
problem of determining tr(e~ l(f>H \tp)(ip\) for an eigenstate of H with non-zero eigenvalue. We 
show that for solving the expectation estimation problem, precision scalings of N l_ a for arbitrarily 
small a > can be achieved with sequential algorithms, and the algorithms can be parallelized 
with minimal additional resources. 

Our motivation for this work is the setting of quantum physics simulations on quantum comput- 
ers. This is one of the most promising applications of quantum computing Ol and enables a poten- 
tially exponential speedup for the correlation function evaluation problem 0,0,0]. The measure- 
ment of these correlation functions reduces to the measurement of the expectation of an operator 
for one or more states. Because the measurement takes place within a scalable quantum computer, 
the operators and states are manipulatable via arbitrarily low-error quantum gates. The quantum 
computational methods that have been described for the determination of these expectations have 
order ^= precision. An example is the one-ancilla algorithm for measuring (U) = tr(U p) for 
unitary U described in lELBIsl], which applies U conditional on an ancilla a prepared in a super- 
position state (Fig.[l]). Improving the precision without special knowledge of the operator or state 
requires more sophisticated algorithms. 

Here we give quantum algorithms based on phase and amplitude estimation H,[l(3] to improve 
the resource requirements to achieve a given precision. We begin by giving an "overlap estimation" 
algorithm (OEA) for determining the amplitude and phase of tr(£7 p) for U unitary. We assume that 
quantum procedures for preparing p from a standard initial state and for applying U are known 
and that it is possible to reverse these procedures. We determine the number of times N that these 
procedures are used to achieve a goal precision p and show that N is of order 1 /p. To determine 
tr(v4p) for observables A not expressible as a small sum of unitary operators, we assume that it is 
possible to evolve under A. This means that we can apply e~ tAt for positive times t. The OEA 
can be used to obtain tr(Ap)t « i{\x(e~ lAt p) — 1) for small t. The problem of how to measure 
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(2*?) = tr(I7p) 



FIG. 1: Quantum network for the one-ancilla algorithm to measure (U) = tr(Up) with 



(|0) a + 



\1) )/y/2 in the logical basis. The desired expectation is given by tr(Up) = (2<ri) = (ci ) + i(oy ), 
where {cr^} and (cry ) are the expectations of the Pauli matrices and a y ^ for the final state, which 



are estimated by repeating the experiment and measuring either c4 or a K y' on the control (ancilla) qubit. 
Because these measurements have ±1 as possible outcomes, their statistics are determined by the binomial 
distribution. 



tr(Ap) with precision p requires determining tr(e~ %At p) with precision better than pt and choosing 
t small enough that the error in the approximation does not dominate. We solve this problem by 
means of an "expectation estimation" algorithm (EEA) with minimal additional knowledge on the 
eigenvalue distribution of A. For this situation, the relevant resources are not only the number 
N of uses of e~ %At and of the state preparation algorithm, but also the total time T of evolution 
under A. We show that to achieve a goal precision p, N and T are of order l/(p 1+a ) and 1/p, 
respectively, with a > arbitrarily small. The term a in the resource bound is due partly to the 
tail distribution of the eigenvalues of A with respect to p. When it is known that p is an eigenstate 
of A, so the distribution is a delta function, a = 0. This applies to the parameter estimation 
problem. In the case where A is unbounded, a is still arbitrarily small if the tail distribution is 
exponentially decaying. But if only small moments of A can be bounded, in which case the best 
bound on the tail distribution decays polynomially, a becomes finite. 

It is important to properly define the meaning of the term "precision". Here, when we say that 
we are measuring tr(Ap) with precision p, we mean that the probability that the measured value 
Omeas is within p of tc(Ap) is bounded below by a constant c > 0. In other words, the "confidence 
level" that a meas — p < tr(Ap) < a mcas + p is at least c. Thus a meas ± p defines "confidence 
bounds" of the measurement for confidence level c. One interpretation of confidence levels is that 
if the measurement is independently repeated, the fraction of times the measured value is within 
the confidence bound is at least the confidence level. For measurement values a mca s that have an 
(approximately) gaussian distribution, it is conventional to use c = 0.68 to identify the precision 
p with the standard deviation. In this case, the confidence level that the measurement outcome 
is within xp can be bounded by erf(x/\/2), where eri(y) is the error function, erf(x/v / 2) > 
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This bound is often too optimistic, which is one reason to specify confidence levels 
explicitly. This becomes particularly important in our use of the "phase estimation" algorithm 
(PEA), whose standard version ^ has confidence levels that converge slowly toward 1 with x. 
Because of these issues, our algorithms are stated so that they solve the problem of determining 
tr(Ap) with precision p and confidence level c, where p and c are specified at the beginning. This 
requires that the resource usage be parameterized by both p and c, and we show that the resource 
usage grows by a factor of order | log(l — c)| to achieve high confidence level c. 
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An important problem in measuring properties of quantum systems is how well the measure- 
ment can be parallelized with few additional resources. The goal of parallelizing is to minimize 
the time for the measurement by using more parallel resources. Ideally, the time for the mea- 
surement is independent of the problem. Typically we are satisfied if the time grows at most 
logarithmically. It is well known that for the parameter estimation problem, one can readily par- 
allelize the measurement by exploiting entanglement in state preparation 111 ill . That this is still 
possible for the OEA and EEA given here is not obvious. In fact, we show that there are cases 
where parallelization either involves a loss of precision or requires additional resources. However, 
the entanglement method for parallelizing measurements works for expectation estimation and for 
overlap estimation when \tr(U p) | is not close to 1. 



II. OVERLAP ESTIMATION 

Let U be a unitary operator and p a state of quantum system S. We assume that we can prepare 
p and apply U to any quantum system S' that is equivalent to S. Both the preparation procedure and 
U must be reversible. In addition, we require that the quantum systems are sufficiently controllable 
and that U can be applied conditionally (see below). We use labels to clarify which quantum 
system is involved. Thus, p^ s '^ is the state p of system S' and U^ s '^ is U acting on system S'. This 
allows us to prepare p and apply U in parallel on multiple quantum systems. 

When we say that we can prepare p, we mean that we can do this fully coherently. That is, 
we have access to a unitary operator V^ SE ) that can be applied to a standard initial state |0) of 

S and an ancillary system E (environment) such that p^ = trE(V r ^ SE ^|0) s ^ E (0|(V r ^ SE ^)^). The state 
V( SE )|0) is a so-called purification of p^ s ^. For our purposes and without loss of generality, we 
can assume that p is pure by merging systems E and S and letting unitaries act on the merged 
system. With this simplification we can write p = = V|0)(0|V"^ and use S, S', . • • to refer 

to equivalent merged systems. The goal of the OEA is now to estimate the overlap (ip\U\ip) of 
with U\tp). 

The OEA and EEA require that S is sufficiently controllable. In particular, we require that it is 
possible to couple S to ancilla qubits and to implement conditional selective sign changes of |0) . 

Let P ^ = l( s ' — 2|0)^(0| be the selective sign change of |0) , with I^ s ^ the identity (or no-action) 
operator. If an ancilla (control) qubit is labeled a, an instance of the conditional selective sign 
change is defined by 

T (aS) = |0) a a (0|I (s) + |l) a (l|P (S) - (1) 
If S consists of qubits and |0) is the usual starting state with all qubits in logical state |0), then this 



is essentially a many-controlled sign flip and has efficient implementations 111211 . 

As mentioned above, for the OEA we require that U can be applied conditionally. This means 
that the unitary operator 

V [aS) = |0) a a (0|I (s) + |1 ) a ( 1\U {S) (2) 
is available for use. When U is associated with an evolution simulated on a quantum computer, 



this is no problem since all quantum gates are readily "conditionalized" 11211 . Nevertheless, we 
note that V is not required if only the amplitude INAU]^) | of (ijj\U\ijj) is needed. 

The "amplitude estimation" algorithm (AEA) [10] can almost immediately be applied to obtain 
|( , 0|I7|'0)|. To accomplish our goals we need to adapt it for arbitrarily prepared states and use a 
version that avoids the complexities of the full quantum Fourier transform [13]. Before we describe 
and analyze the version of the AEA needed here, we show how the OEA uses it to estimate the 
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phase and amplitude of (ip\U\ip). Let AE(U, \ip),p) be the estimate of obtained by the 

AEA for goal precision p. (We specify the meaning of the precision parameter below.) 

Overlap estimation algorithm: Given are U, (in terms of a preparation unitary V : |0) i— > 
and the goal precision p. An estimate of (iJ)\U\ip) is to be returned. 

1. Obtain a = AE(U, |?/>),p/4), so that a is an estimate of | (tp\U\ip) | with precision p/A. 

2. Obtain 6 = AE(^ (aS) , |+^ s = |+) a |^),p/16). 
Notethat aS (+^| c [/ (aS) |+^) aS = (1 + (tfj\U\^})/2. 

3. Obtain 6 V2 = AE(e i<T ~~ (a) "/ 4t £/ (aS) , |+^> p/16). 

Note that aS (+^|e iffz<a)7r/4 T/ (aS) ) = e i7r/4 (l - i(ip\U\ip)) /2. 

4. Estimate the phase 9 of (ip\U\ijj) by computing the argument of the complex number y 
defined by 

Re(y) = (46g -a 2 - l)/2, 

lm(y) = (4b 2 n/2 -a 2 - l)/2. (3) 

If a, b and b w / 2 were the exact values of the amplitudes estimated by the three instances of 
the AEA, then we would have y = (^\U\ip). For example, the formula for Re(y) may be 
obtained by geometrical reasoning, as shown in Fig. [2j 

5. Estimate (ip\U\ip) as e t9 a. The reason for not using y directly is that if the overlap has 
amplitude near 1, then the error in the amplitude of y can be substantially larger than the 
error in a. (This is because of the way we estimate y using a PEA; see below.) 

We define OE(£7, \ip),p) to be the value returned by the OEA. A flowchart for the algorithm is 
depicted in Fig.|3] 

When a = \(ip\U\il>)\ is close to 1, the absolute precision with which a is obtained is as much 
as quadratically better for the same resources. To avoid this nonuniformity of the precision to 
resource relationship, we define the precision 5 of an overlap by means of a parameterization of 
{^jj\U\ijj) using the points (xi, x 2 , x 3 ) on the upper hemisphere of the surface of a unit sphere in 
three dimensions. For this purpose, define h(x\, x 2 , x 3 ) — X\ + ix 2 for x\ + x\ + x\ = 1 and 
x 3 > 0. Define the distance between (x x , x 2 , x 3 ) and (x[, x' 2 , x' 3 ) to be the angular distance along 
a great circle. The precision of the value o returned by the OEA is determined by the distance 
5 between the liftings h~ l (o) and h~ l ((il)\U\il))) (see Fig. |4j). We define the precision of the 
value returned by the AEA similarly, by restricting the parametrization to the positive reals. The 
precision parameters with which the AEA is called in the OEA are chosen so that the returned 
overlap has precision 8 < p with respect to our parametrization (see Note J3]). 

The AEA is based on a trick for converting amplitude into phase information, so that an efficient 
PEA can be applied. Let |V'o) = \if>) and = U\ij>). Let S = I- 2|^o)(^o| = VP V^ be the 
selective sign change of |i/> ) and Si — I — 2|'0i)(0i| = UVPoV^W the selective sign change of 
The composition S = SoSi is a unitary operator that rotates | V^o) toward j^i) in the two- 
dimensional subspace Q spanned by |^ ) an d IV'i)- The rotation is by a Bloch-sphere angle of 
2<p = 4arccos(|( , i/'o|V ; i)l)- Thus, the eigenvalues of S in Q are e ±% ^. The Bloch sphere picture of 
the states and the rotation are shown in Fig. 13 When | (^ol^i) I = I {^\U\i^) \ — 1, S is the identity 
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FIG. 2: Geometrical construction for computing Re((ip\U\ip)) from a = \(ip\U\tp}\ and 2bo = |(1 + 
{ip\U\ip))\. According to the law of cosines, (26 ) 2 = a 2 + 1 + 2acos(6>), and we have Re«^|?7|^)) = 
acos(8) = ((26 ) 2 - a 2 - l)/2. 



DO: 

a=AE(U,\1>),l) 
a*\^\U\v)\ 



ESTIMATE y: 

(with precision p) 



INPUT 


u,\% 


lj),p 
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r 

b =AE(<U {3 


> 

S) ,|+^)aS,^) 


& ~|l + (VW>|/2 

v. J 







My) = (46g- a 2 -l)/2 
Im(y) = (4&2 /2 _ a 2 - l)/2 
lHVW> = OE(tf,|^>,p) 



6 V2 =AE(C/,|+^) 3S ,^) 

= e i4 a) T/4 c[ /(aS) 

^/ 2 ~|l-i(v|llv)|/2 



FIG. 3: OEA flowchart. An estimate of the overlap (^|£/|^>) is obtained. The algorithm requires three state 
preparations and calls the AEA three times. The amplitude of the returned value shown in the flowchart 
may need to be adjusted according to the value of a to optimize the precision. For details see the text. 
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FIG. 4: Visualization of the parameterization of the overlap in terms of points on the upper hemisphere of 
a unit sphere. The function h is denned by h(xi,X2,x^) = x\ + 1x2. Note that for overlaps |(V'|C/|V ; )| 
approaching 1 and small 8, 5' approaches 5 2 /2 <C 5. 

operator. The PEA for S with initial state | ip ) determines the phase <p of one of these eigenvalues, 
where each of the signs has equal probability of being returned. The overlap | {il)\U\ijj) | is obtained 
from (p by the formula | (ijj\U\ijj) \ = cos(</>/2). The PEA requires use of the conditional S operator, 
C S. As defined, this needs to be decomposed into a product of C P , V and < V. A significant 
simplification is to not condition U and V and to write C S = V c PqV^UV c PqV^U^ . This works 
because if the controlling qubit is in state |0), all the £/'s and V's are canceled by matching W's 
and yt> s {j. 

Let PE(W, \ip'),p) be a phase returned by the PEA for unitary operator W and initial state \ip') 
with precision goal p. The AEA may be summarized as follows. 

Amplitude estimation algorithm: Given are U, \ifj) (in terms of a preparation unitary : |0) 1 — 
|-0)) and the goal precision p. An estimate of | (%jj\U\%jj) | is to be returned. 

1. Let <j) = PE(S, \ip) : 2p) with S = S S l = VP V^UVP V^W. 

2. Estimate \{ip\U\ip)\ as |cos(0/2)|. 

The precision parameter for the PEA has the conventional interpretation (modulo 2tt). Because 
arccos(\ (ijj\U\ijj)\) is the angle along the semicircle in the parametrization of the overlap defined 
above, the precision 2p of the value returned by the PEA translates directly to the desired precision 
in the value to be returned by the AEA. 

The PEA BSD for a unitary operator W and initial state \tp') returns an estimate of the phase <fi 
("eigenphase") of an eigenvalue e l<t> of W, where the probability of is given by the probability 
amplitude of \ip') in the e l9i -eigenspace of W. In the limit of perfect precision, it acts as a von 
Neumann measurement of W on state in the sense that the final state is projected onto the 
e^-eigenspace of W. For finite precision, the eigenspaces may be decohered and the projection 



8 




FIG. 5: Bloch sphere picture of the rotations induced on the subspace spanned by \ip) and U\ip) by the 
operators So and Si . 



is incomplete, unless there are no other eigenvalues within the precision bound. The error in the 
projection is related to the confidence level with which the precision bound holds. 



The original PEA is based on the binary quantum Fourier transform II13I1 . It determines an 
eigenphase <fi with precision i with 2 n — 1 uses of the conditional W operator to obtain a phase 
kickback to ancilla qubits. The original PEA begins by preparing n qubits labeled 1 ... n in state 
|+^ . . . |+) n and system S in state W) s - Next, for each m = 1, . . . , n, W is applied from qubit m 
to system S 2 m ~ 1 times. The binary quantum Fourier transform is applied to the n qubits, and the 
qubits are measured in the logical basis |0), 1 1). The measurement outcomes give the first n digits 
of the binary representation of 0/(27r) + e/2 n , where |e| < 1/2 with probability at least 0.405 

The PEA as outlined in the previous paragraph makes suboptimal use of quantum resources. We 
prefer a one-qubit version of the algorithm based on the measured quantum Fourier transform lfl5ll 
that has been experimentally implemented on an ion trap quantum computer iflrjl . An advantage 
of this approach is that it does not require understanding the quantum Fourier transform and is 
readily related to more conventional approaches for measuring phases. To understand how the 
algorithm given below works, note that the eigenstates of W are invariant under W. The only 
interaction with S is via uses of W. Therefore, without loss of generality, we can assume that S 
is initially projected to an e^-eigenstate of W with < (f) < 2n. The bits of an approximation 
of 0/ (27r) are determined one by one, starting with the least significant one that we wish to learn. 
Given n, let [.bi . . .b n ] 2 = Y^i=i (with = 0, 1) be a best n-digit binary approximation to 
0/ (27r), where the notation [x\2 is used to convert a sequence of binary digits x to the number that 
it represents. Write e = (0/(2vr) - [.b x . . . b n ] 2 )2 n . 

Phase estimation algorithm: Given are W, (as a state of a quantum system) and the goal 
precision p. An estimate of an eigenphase of W is to be returned, where the probability 
of (p is given by the population of \ip') in the corresponding eigenspace. 

0. Let n be the smallest natural number such that 2™ > 1 /p. 
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FIG. 6: Step 2 of the PEA to estimate bit k of the eigenphase, where k = 3. The phase 0^ is computed 
according to previously obtained information about the eigenphase. By applying it before the measurement, 
the probability of obtaining the optimal value for bit k is maximized. The measurement is denoted by the 
triangle pointing left with +/— inside and is a measurement in the |+)/| — ) basis. The outlined part of the 
network will be parallelized in Sect. [V] 

l.a. Prepare |+) in an ancilla qubit a and apply °W^ S ' 2 n ~ 1 times. With the auxiliary 
assumption that \ r tp') is an e'^-eigenstate of W, the effect is a phase kickback, changing 

|+) a to (|0) a + e* n - 1 *\l\)/V2. 

1. b. Measure a in the |+), |— ) basis, so that measurement outcome (1) is associated with 

detecting |+) (|— )). Let b' n be the measurement outcome. With the auxiliary assump- 
tion, the probability that b' n = b n is cos(7re/2) 2 . 

2 Do the following for each k — (n — 1), . . . , 1: 

2. a Prepare |+) in an ancilla qubit a and apply 2 k ~ 1 times. With the auxiliary 

assumption, this changes |+) to (|0) + e l2fe_1< ^|l) )/\/2. 

2.b Compensate the phase of |1) by changing it by e~ lw ^ b ' k + 1 '" b ' n ^ 2 . With the auxiliary 
assumption, this changes the state of the ancilla to (|0) a +e i(2 * : ~ 1 ^ 7r[ - 6 '=+i- 6 '^ 2 ) |1) )/-y/2. 

2.c Measure a in the |+), | — ) basis to obtain b' k . With the auxiliary assumption and if 

b\ = bi for I > k, the probability that b' k = b k is cos(7re/2"~ fc+1 ) 2 . 

3 Estimate as 2n[.b' 1 . . . b' n }2. 

A step of the algorithm is depicted in Fig. [6j 

The probability P(e) that the value returned by the PEA is 2n[.b\ . . . b n ]2 is the product of the 
probabilities cos(7re/2') 2 for I = 1, . . . ,n and is bounded below by sin(7re) 2 /(7re) 2 . This bound 
can be obtained by taking the limit n — > oo in P(e). The worst case is given for |e| = 1/2, leading 
to the bound P(e) > 4/vr 2 w 0.405 Since the goal precision is 2 n , it is acceptable for the 
algorithm to obtain the next best binary approximation to <fi. For this, the value obtained for b' n may 
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not be the one with maximum probability, but the subsequent bits b' k are always the best possible 
given b' n . Taking this into account, the probability that the phase returned is within 2~ n is given by 



P(e) + P(l - e) > 8/tt 2 « 0.81 (see Note (jj). 

The key step of the one-qubit phase estimation procedure is to modify the phase kickback by 
the previously obtained phase estimate. This differentiates it from an adaptive phase measurement 
method that determines the bits of an approximation of (p/(2n) starting with the most significant 
bit, and making sufficiently many measurements with different phase compensations for each bit 
to achieve high confidence level. This is the phase estimation method given in 111 811 and mentioned 
in JU], which approximates what is done in practice for the efficient determination of an unknown 
frequency or pulse time. 

The resources required by the PEA, AEA and OEA can be summarized as follows. 

PE(W, W),p): This requires N(p) = 2^°^ l / p ^ - 1 uses of W. is prepared once. Here, \x] 
denotes the least integer m > x. 

AE(U, \ip),p): This calls PE once. It requires N(2p) uses of S = VP V j UVP V j W and one 
use of V to prepare the initial state. We count this as being equivalent to 4iV(2p) + 1 state 
preparations and 2N(2p) applications of U . 

OE({7, \ip),p): This contains three calls to the AEA with higher precision. The total resource 
count is 8N(p/8) + 4N(p/2) + 3 state preparations and 4N(p/8) + 2N(p/2) uses of U. 

Since N(p) is of order l/p, each of these algorithms uses resources of order I /p. 



III. CONFIDENCE BOUNDS 

The PEA as described in the previous section obtains an estimate </> cst of an eigenphase <p such 
that the prior probability that |0 es t — <f>\ < 2~ n+1 ir is at least 0.81, regardless of the value of 0, 
where n = |"log 2 (l/p)] . (The comparison of </> cst to <fi is modulo 2tt, so that |0 est — <f>\ is angular 
distance between e^ cst and e^.) Thus, after having obtained cst , we say that = cst ± 2~ n+1 7r 
with confidence level 0.81 or P[0 cst - 2~ n+1 7i < <p < cst + 2" n+1 7r] = 0.81. The error bound 
of 2~ n+1 7r must not be confused with a standard deviation. Suppose that we use a single sample 
from a gaussian distribution with standard deviation a to infer the mean. We would expect that 
the confidence level increases as 1 - e-^'^ for an error bound of A. (The notation Q(x) 
means a quantity asymptotically bounded below by something proportional to x, that is, there 
exists a constant C > such that the quantity is eventually bounded below by Cx.) In general, it 
is desirable to have confidence levels that increase at least exponentially as a function of distance 
A or as a function of additional resources used. Unfortunately, for a single instance of the PEA, 
we cannot do better than have confidence level 1 — 0(1/ A) for <p = <Pcst ± 2~ n+1 nA ||9|]. (Here, 
0(x) denotes a quantity that is of order x, that is a quantity that is eventually bounded above by 
Cx for some constant C. The meaning of "eventually" depends on context. Here it means "for 
sufficiently small x". If the asymptotics of the argument require that it go to infinity, it means 
"for sufficiently large x".) The method suggested in ||9|] for increasing the confidence level is to 
use the PEA with a higher goal precision of p/2 l . However this improves the confidence level on 
4> — 0cst ± 2~ n+1 7rA to only 1 — Vi(l/(A2 1 )) and requires a 2 l resource overhead, which is not an 
efficient improvement in confidence level. 

A reasonable goal is to attain confidence level c = 1 — e~ n( - r ^ that = est ± 2~ n+1 ir with 
a resource overhead of a factor of 0(r). This modifies the resource counts from the previous 



11 



section from 0(1/ p) to 0(\ log(l — c)\/p), where c is the confidence level achieved. To attain this 
goal, we modify each step of the PEA by including repetition to improve the confidence level that 
acceptable values for the bits are determined. Let the two nearest n-digit binary approximations 
to 0/(2tt) be given by 0/(2tt) = [.61 . . . b n ] 2 + 5/2 n and 0/(2tt) = [.h . . . b n } 2 + (6- l)/2 n , 
where < 8 < 1. We wish to obtain one of these approximations with high confidence level. 
For the first step of the PEA, we perform two sets of r experiments to obtain a good estimate of 
5' = tt(5 + b n ). The first set consists of r (1+) , I— ) ) -measurements of the state W 2 ™ 1+) lijj) . 

a a a 5 

The second consists of r (|+) a , |— ) a ) -measurements of the state W 2 " (|0) a — i|l) a )/v^2|^) s - Let 
xi,X2 be the sample means of the measurement outcomes of the two sets of experiments. In the 
limit of large r, x\ and x 2 approach sin(5'/2) 2 and sin(5'/2 — vr/4) 2 , respectively. We have 

sin(5') = cos(5' - tt/2) = 1-2 sin(572 - tt/4) 2 , cos(5') = 1-2 sin(572) 2 , (4) 

so we can estimate 5' from x\ and x 2 by letting 5' cst be the phase of the complex vector (1 — 2xi) + 
i(l — 2x 2 ). The probability of the event E that 5' differs from 5' est by more than it /4 modulo 2n 
can be bounded as follows. For this event, | sin(5') + i cos(5') — ((1 — 2x\) + i(l — 2x 2 ))\ 2 > 1/2. 
It follows that either | sin(<572) 2 -xt\ > 1/4 or | sm(5'/2 - 7r/4) 2 -x 2 \ > 1/4. The probability of 
each of these possibilities is bounded by the probability that the mean of r samples of the binomial 
distribution with probability p of outcome 1 differs from p by at least x = 1/4. The probability 
of this event is bounded by 2e~ 2rx2 = 2e~ r / & (Hoeffding's bound ifl^l '). This bound can now be 
doubled to obtain a bound of 4e~ r//8 on the probability of E. 

Let a n — 1 if 5' cst is closer to n than 0, and a n = otherwise. Then a n = b n or a n = b n . 
Which equality holds does not affect the subsequent arguments, so without loss of generality, 
assume that a n = b n . Suppose that event E did not happen and that we have correctly obtained 
o-n = b n , . . . , dk+i = b k+1 . For the step of the algorithm that determines the fc'th bit, modify 
the original step by compensating the phase of |l) a by e _i ( 7r [- 6fc + 1 --- 6n - 1 l 2+,5 ^ t / 2 ' 1 ) and repeating 
the measurement r times. We set — 1 if the majority of the measurement outcomes is 1 and 
a/; = otherwise. For each measurement, the probability that the measurement outcome does 
not agree with b k is at most sin((<5 / — S' est ) /2 n ~ k+1 ) 2 . Our assumptions imply that this is at most 
sin(7r/2 rt ~ fe+3 ) 2 < (it / 2 n ~ k+3 ) 2 . Using Hoeffding's bound again, the probability that a k ^ b k is 
bounded by 2e~ 2r ( 1 / 2_ ( 7r / 2 ™~ fe+3 ) 2 ) < 2e~ r l 2 (for a loose upper bound). 

Summing the probabilities, we find that the probability that we do not learn b\ . . .b n oxb\ . . .b n 
is bounded by x(n, r) = 2{n — l)e~ r//2 +4e~ r / 8 . We can therefore say that the modified PEA yields 
the desired phase to within 7i/2 n ~ 1 with confidence level 1 — x(n,r), where x(n,r) decreases 
exponentially in r. Note again that this confidence bound still should not be confused with a 
similar confidence bound for a gaussian random variable. Increasing the confidence bound does 
not result in the expected increase in confidence level. In order to have confidence level increasing 
exponentially toward 1 with increasing confidence bound and an additional overhead of at most 
0(| log(p) |), we can repeat the determination of the fc'th bit 2 n ~ k r instead of r many times. 

For the purpose of having high confidence level in the precision with which a quantity is es- 
timated, our algorithms require the confidence level goal as an input. The modified PEA may be 
outlined as follows. 

Modified Phase estimation algorithm: Given are W, \ip') s , a goal precision p and a goal confi- 
dence level c. An eigenphase of W is to be returned, where the probability of <\> is given 
by the population of in the corresponding eigenspace. The final state of S consists of 
states with eigenphases in the range <\> ± p with prior probability at least c. 
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0. Let n be the smallest natural number such that 2™ > 1/p. Let r be the smallest natural 
number such that x(n, r) < (1 — c). 

1 . Obtain <5g St with the two sets of r measurements described above. Let a n = 1 if 5' est is closer 
to 7r than and a n = otherwise. 

2 Do the following for each k = (n — 1 ),..., 1, in this order: 

2. a Obtain an estimate of the fc'th bit a k of a binary approximation to 0/(27r) by r repeti- 
tions of the measurement of steps 2.a-c given previously, but with a phase compensa- 
tion that uses <^ st as well as the previously obtained bits. 

3 Return 27r[.ai . . . a n ] 2 . 

We define PE( W, \ip') , p, c) to be the value returned by the modified PEA. 

The resources required grow by a factor of less than 2r, where r = 0(\ log(l — c)|). The 
constant hidden by the order notation may be determined from the expression for r in step and 
is not very large. To modify the AEA to attain confidence level c, it suffices to change the call to 
PE by including c as an argument. Because the OEA has three independent calls to the AEA, it 
needs to make these calls with confidence level arguments of 1 — (1 — c)/3 to ensure that the final 
confidence level is c. The resource requirements of all three algorithms are 0( \ log(l — c)\/p), 
where this applies to both the uses of U and of the state preparation operator V in the case of the 
AEA and OEA. 

IV. EXPECTATION ESTIMATION 

Let A be an observable and assume that it is possible to evolve under ±A for any amount of 
time. This means that we can implement the unitary operator e~ tAt for any t. The traditional 
idealized procedure for measuring (A) = tr(Ap) is to adjoin a system consisting of a quantum 
particle in one dimension with momentum observable p and apply the coupled evolution e~ lA ®v 
to the initial state p ® |0)(0|, where |0) is the position "eigenstate" with eigenvalue 0. Measuring 
the position of the particle yields a sample from the distribution of eigenvalues of A o, 12011 . 
This procedure requires unbounded energy, both for preparing |0) and to implement the coupled 
evolution. Performing this measurement N times yields an estimate of (A) with precision of order 
vax(A)/y/N, where the variance is vai(A) = ((A— (A)) 2 ). It is desirable to improve the precision 
and to properly account for the resources required to implement the coupling. 

We focus on measurement methods that can be implemented in a quantum information pro- 
cessor. In order to accomplish this, some prior knowledge of the distribution of eigenvalues of 
A with respect to p is required. Suppose we have an upper bound b on |tr(Ap)| and a bound on 
the tail distribution F(A) > tr([\A - (A)\ > A]p), where [\A - (A)\ > A] denotes the pro- 
jection operator onto eigenspaces of A with eigenvalues A satisfying |A — (A)\ > A. That is, 
F(A) > Y^\\-(a)\>aP* w ^ m = tr (l^)(^lp)- Without loss of generality, F is non-increasing in 
A. An estimate on the tail distribution is needed to guarantee the confidence bounds on tr(Ap) 
derived from measurements by finite means. Here are some examples: If the maximum eigenvalue 
of A is A max , we can set b = A max and use F(A) = 1 if A < A max and F(A) = 0, otherwise. 
Suppose that we have an upper bound v on the variance var(A). If we know that the distribution 
of eigenvalues of A is gaussian, we can estimate F(A) by means of the error function for gaussian 
distributions. With no such prior knowledge, the best estimate is F( A) = min(l, v/A 2 ). (Observe 
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that v > A 2 J2\x-(a)\>a Pa-) Such "polynomial" tails result in significant overheads for measuring 
{A). "Good" tails should drop off at least exponentially for large A ("exponential tails"). 

We give an EEA based on overlap estimation. The relevant resources for the EEA are the 
number M of times a unitary operator of the form e~ tAt is used, the total time T that we evolve 
under A, and the number N of preparations of p. The total time T is the sum of the absolute 
values of exponents t in uses of e~ lAt . For applying the OEA, it is necessary to be able to evolve 
under —A as well as A. If the evolution is implemented by means of quantum networks, this 
poses no difficulty. However, if the evolution uses physical Hamiltonians, this is a nontrivial 
requirement. The complexity of realizing e~ lAt may depend on t and the precision required. Since 
this is strongly dependent on A and the methods used for evolving under A, we do not take this 
into consideration and assume that the error in the implementation of e~ lAt is sufficiently small 
compared to the goal precision. In most cases of interest this is justified by results such as those 



in Il21ll . which show that for a large class of operators A, e~ tAt can be implemented with resources 
of order t 1+a ' /e a ' , where e is the error of the implementation and a' is arbitrarily small. 

For exponential tails F, our algorithm achieves M,N = 0(l/p 1+a ) and T = 0(l/p) for 
arbitrarily small a. The order notation hides constants and an initialization cost that depends on b 
and F. The strategy of the algorithm is to measure tr(e~ lAt p) for various t. In the limit of small 
t, ti(e~ iAt p) = 1 + 0(t 2 ) - i((A)t + 0(t 3 )), so that (A) can be determined to 0(t 3 ) from the 
imaginary part of tr(e~ tAt p). The first problem is to make an initial determination of (A) to within 
a deviation of A as determined by F. This is an issue when b is large compared to the deviation. 
To solve the first problem, we can use phase estimation. We also give a more efficient method 
based on amplitude estimation. The second problem is to avoid excessive resources to achieve the 
desired precision while making t small. To solve this problem requires choosing t carefully and 
taking advantage of higher-order approximations of (A) by linear combinations of tr(e~ lAt p) for 
different times t. 

To bound the systematic error in the approximation of (A) by itr(e~ iAt p), note that |Im(e l61 ) — 
9\ < 9 3 /6. To see this it is sufficient to bound the Lagrange remainder of the Taylor series 
of sm(9). This bound suffices for achieving a = 1/2 in the bounds on M and N. Reducing a 
requires a better approximation, which we can derive from the Taylor series of the principal branch 
of ln(x + 1). For |a;| < 1, 



A' 



I ln(x + 1) - ^{-l) k - 1 x k /k\ < \x\ K+l /((K + 1)(1 - \x\) K+1 ). (5) 

k=l 

To apply these series to the problem of approximating (A), we compute 

K K 

^{-l) k -\e- lBt - l) k /k = Cie~ iBl \ (6) 



k=l 1=0 
)A 



for real constants C\ satisfying \C{\ < 2 . In particular, if B is an operator satisfying \B\ < x/t, 
we can estimate 



A 



ttr(Bp) + Im tr(e~ lBlt p) 



1=0 



< H A ' + 7((A^ + l)(l-|x|)^ +1 ). (7) 



Define G e (A) = AF(A) + F(s)ds. Then G e (A) is an upper bound on the contribution 
to the mean from eigenvalues of A that differ from the mean by more than A. That is, G e (A) > 
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x-(A)\>a 1^ — ("4)1 Pa- Like F(A), G e (A) is non-increasing. 
We assume that a non-increasing bound G(A) > G e (A) is known and that G(A) —> as A — > oo. 
Because F(A) < G e (A) /A, we can use G to bound both G e and F. For x > 0, define G~ 1 (x) = 
inf{A|G(A) < x}. The behavior of G _1 as x goes to determines the resource requirements 
for the EEA. If A is a bounded operator with bound A max , then we can use G~ 1 (x) < A max 
independent of x > 0. If F is exponentially decaying, then so is G, and G^ 1 (x) = 0(\ log(x)\). 
For polynomial tails with F(A) = 0(1/A 2+/3 ), we have G(A) = 0{l/A l+l3 ) and G^ 1 {x) = 

( 1/x l/(l+/3))_ 

The EEA has two stages. The first is an initialization procedure to determine (A) with an 
initial precision that is of the order of a bound on the deviation of A from its mean, where the 
deviation is determined from F and G. This initialization procedure involves phase estimation to 
sample from the eigenvalue distribution of A. Its purpose is to remove offsets in the case where 
the expectation of A may be very large compared to the width of the distribution of eigenvalues 
as bounded by F and G. The second stage zooms in on tr(Ap) by use of the overlap estimation 
procedure. As before, we can assume without loss of generality that p is pure, p = \ip)(ip\. We 
first give a version of the EEA that achieves M,N — 0(l/p 3 ^ 2 ) and then refine the algorithm to 
achieve better asymptotic efficiency. 

Expectation estimation algorithm: Given are A, (in terms of a preparation unitary V : 
|0) i— > a goal precision p and the desired confidence level c. The returned value is 
within p of (A) = tr (A\il>) with probability at least c. 

Stage I. 

0. Choose A such that F(A/2) < 1/4 and A > p. A should be chosen as small as 
possible. Let U = n /(A(b + A)). Let r be the minimum natural number such that 
2e~ r / 8 < (1 — c)/4 and set d according to the identity r(l — d) = (1 — c)/4. 

1. Obtain Ai,...,A r from r instances of the PEA, A k = PE(e~ iAti , AU/2, d), 
where 2% is subtracted for any return values between n and 2n to ensure that — n < 

A k < 71. 

2. Let A m be the median of Ai, . . . , A r . We show below that the probability that \A m /ti + 
(A) | > A is bounded by 2e~ r ^ + r (1 - d) < (1 - c)/2. 

3. Let ao = — A m /tj. We expect ao to be within A of (A) with confidence level 1 — (1 — 

c)/2. 

Stage EL If p = A, return a and skip this stage. 

0. Choose ^max and t so that they satisfy 

(A) e 3 m j6 < (t/2)p/4, 

(B) G(9 ma Jt) < 9 maxP /8, 

(C) ^max < 1) 

(D) tA < max . (8) 

The constraints and how they can be satisfied are explained below. The parameter t 
should be chosen as large as possible to minimize resource requirements. 

1. Obtain x = OE( e - i(A - ao)(t/2) , |^), (t/2)p/4, 1 - (1 - c)/2). 

2. Return -Im(x)/(t/2) + a . 
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Consider stage I of the algorithm. The probability that \A m /ti + (A)\ > A may be bounded 
as follows. The choice of t { ensures that eigenvalues A of — Ati within Ati of the mean are 
between ±7r/4 and do not get "aliased" by e~ tAti in the calls to the PEA. With probability at 
least 1 — r(l — c'), each A fc returned by these calls is within tjA/2 of an eigenvalue of — Ati 
sampled according to the probability distribution induced by \tp). Assume that the event described 
in the previous sentence occurred. The probability that | A m / tj + ( A) | > A is upper bounded by the 
probability that at least \r/2] of the r samples fall outside the range [—{A)ti — Ati, — (A)ti + Ati]. 
The choice of A with respect to F implies that Hoeffding's bound can be applied to bound this 
probability by 2e _r / 8 . Thus, we can bound the overall prior probability P that |A m /tj + (A) \ > A 
by P < 2e~ r / 8 + r(l - d) < (1 - c)/2. 

The resources required for stage I include N = r = 0(\ log(l — c)|) preparations of 
M — 0(| log(l - c) | (b + A) /A) uses of e~ iAs (specifically, M is within a factor of 2 of 2r / AU) 
and a total evolution time of T = 0(| log(l — c)|/A) (where T is within a factor of 2 of 2rA). 
Note that none of these resource bounds depend on the p and that A is a bound on a deviation of A 
from the mean with respect to Also, if A is of the same order as b, the formulation of stage I 
of the algorithm is such that the uses of phase estimation require minimal precision. In fact, in this 
case, stage I of the algorithm could be skipped with minor adjustments to stage II. We show below 
that stage I can be modified so that the overhead as a function of b is logarithmic. The modification 
requires that the number of state preparations is of the same order as M. 

In the special case of parameter estimation (see the introduction), A = p. Consequently stage 
II is skipped and the resources of stage I are the total resources required. The algorithm therefore 
achieves the optimal 0(l/p) resource requirements for this situation. 

Consider stage II of the algorithm. The error | — Im(x) / (t/2) + a — (A) | may be bounded as 
follows. We assume that all the precision constraints of stage I and II are satisfied. The confidence 
level that this is true is c overall. With this assumption, xj (t/2) is withinp/4 (the "precision error") 
of tr(e~^ A_a °^* //2 V)/(t/2). There are three contributions to the "approximation error", which is 
the difference between — Imtr(e~ 4( ^ 4 " ao ^*/ 2 )p)/(t/2) and tr((A — ao)p). For all contributions, 
we have to consider the fact that ao approximates (A) to within only A, which is why we need 
constraint (D) of Eq. ©. The first arises from eigenvalues of (A — ao)[t/2) in [—9 ma _ x , +# max ] 
due to |Im(e 4 ' 9 ) — 9\ not being zero and is bounded by #^ ax /(6(t/2)) = p/A (constraint (A) of 
Eq. ©). The second and third come from eigenvalues of (A — a )(t/2) outside [— max , +0 max \. 
Constraint (D) of Eq. © implies that |(a - (A))(t/2)\ < 9 max /2. Constraints (B) and (C) 
of Eq. © imply that the contribution to (A) of eigenvalues differing from the mean by more 
than 6 l max /(2(t/2)) is at most 9 max p/8 < p/8. However the same eigenvalues still contribute 
to the measurement, each contributing at most 1 to x. Constraint (B) of Eq. © together with 
the inequality F(A) < G(A)/A imply that F(0 max /(2(£/2))) < tp/8 so this contribution has 
probability at most tp/8 and therefore adds at most another p/A (after dividing by t/2) to the 
approximation error. Thus, the combination of the approximation and precision error is less than 
p, as desired. Clearly these estimates are suboptimal, tighter choices of 9 max and t could be made. 
However, this does not affect the asymptotics of the resource requirements. 

To find good solutions # max and t subject to the constraints given in Eq. ©, we can rewrite the 
constraints as follows: 

(A) G~\9 maxP /8) < 9 max /t < ( P /8)/(9 2 m j6), (9) 
(B') 6> max < 1, 9 mSLX /t > A. 

The first inequality of (A') is implied by constraint (B) and the second by constraint (A) of Eq. ©. 
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To satisfy these constraints, we first find 6> max < 1 as large as possible so that 

(A") A < G-\9 maxP /8) < (p/8)/(Cax/6), (10) 

and then set t = 9 max /G~ 1 (9 raa _ x p/8). Consider the three examples of bounded, exponential and 
polynomial tails. For the case of bounded tails, constraint (A") of Eq. (flOb can be solved by setting 
9 max according to A max = (p/8)/(^ ax /6), so that 9 max = (3p/(4A max )) 1 / 2 . The parameter t is 

given by 9 max /X max = (3p/4) 1 / 2 /Amf x = Qip 1 ^ 2 ). For the case of exponential tails, we can use 

G-\x)_= 0(| log(x)|) to show that # max = fi((p/| log(p)l) 1 / 2 ) and t = fi(p 1/2 /| log(p)| 3 / 2 ) (see 

Note Q). For polynomial tails with G~ l {x) = C^ar 1 ^ 14 ^), we get 9 max = fi(p {2+/3)/(1+2/3) ) 
and t = fi( i3 (5+6/3+/3 2 )/((i+/3)(i+2 /3 )) ) (see Note 

The resource requirements for stage II of the EEA can be estimated as M = 0( \ log(l — 
c)\/tp) uses of an exponential of the form e~ lAs , N = 0(\ log(l — c)\/tp) state preparations, 
and a total time of T = 0(\ log(l — c)\/p), in terms of the parameter t computed in step (of 
stage II). The dependence on G shows up in the value of t. With t as computed in the previous 
paragraph, for bounded A, M and N are 0(\ log(l — c)|/p 3 / 2 ). For exponential tails, M and N 
are 0(| log(l — c)\/(p/\ log(p)|) 3 / 2 ). For polynomial tails, they are 0(\ log(l — c)|/p 7 ^), where 
7(/3) is a polynomial satisfying ^{(3) — > 1 + 1/2 for /3— >oo and = 1 + 5 for j3 = 0. 

To reduce the resource requirements of stage II of the EEA, we use overlap estimation at mul- 
tiple values of t and Eq. ©. Here is the modified stage. We assume that K > 2. 

Stage IT. 

0. Choose # max and t so that they satisfy 

(A) 0^/((K + 1)(1 - ^ max )^+ 1 ) < (t/2)p/A, 

(B) G{9 m&x /t) < 9 max p/(8K2 K ), 

(C) 9 max < 1, 

(D) tA < £ max . (11) 
The parameter t should be chosen as large as possible to minimize resource requirements. 

1. For I = 1, . . .,K, obtain yi = OE( e -< A -^W 2 \ (t/2)p/(AK2 K ), 1 - (1 - c)/{2K)). 
Let y = 1. 

2. Return -lm(Y^L Q C lVl )/(t/2) + a . 

The precisions and the confidence levels in the calls to the OEA have been adjusted so that the 
final answer has the correct precision and confidence level. The explanation for this is similar to 



that for the original stage II (see Note 112411 ). 

The earlier method for finding 9 ma _ x and t is readily adapted to the constraints in stage II'. 
Constraint A" of Eq. (ITOb now reads as 

(A") A < G-\9 m&x p/{8K2 K )) < (p/8)(K + 1)(1 - 9 inax ) K + 1 /9« ax , (12) 

and we can set t = 9 max / G^ 1 (9 ma , x p / (8K2 K )). To simplify the right hand side of Eq. (fl~2l . 
we add the inequality # max < 1/(K + 1), and use the inequality 1/4 < (2/3) 3 < (1 — 
1/(K + l)) K+1 (for K > 2) to replace the right hand side by (p/32)(K + l)/9* ax . Thus 
for bounded tails, ^ max = Q(mm(l/K, (Kpf/ K )) and t = Q(mm(l/K, {Kp) l / K )), where 
we give the asymptotic dependence on K explicitly but suppress parameters not depending 
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on K or p (see Note 0D- For exponential tails, 9 max = l)(min(l/A, log(p)l) 1 ^)) 
and t = ft(min(l/(A(|log(p)| + K)),p^ K /(\ \og(p)\^ K (\ log(p)| + A")))) (see Note 
El ). For polynomial tails with exponent /3, 6> max = ^(min(l/A,p (2+/3)/(x - 1+A " /3) )) and 
t = ^mm(l/(A(2 x /( 1+ ^(i^-Y /(1+ ^)),2^^ (see Note 

01). 

With the expressions from the previous paragraph, we can estimate the resources requirements 
of stage IF. In terms of t, M and iV are 0(K 2 2 K \ log(l-c)|/tp), andT = 0(K 3 2 K \ log(l-c)|/p), 
where the powers of K account for the K calls to the OEA, the coefficient in the denomina- 
tor of the precision, and in the case of T, the factor of / in the evolution time. For bounded 
tails, we obtain M,N — 0(|log((l — c))\K 3 2 K /p 1+1 ^ K ^), where we have loosely increased 
the power of K by 1 to account for the upper bound of 0(1/ K) on t. For exponential tails, 
M,N — 0(|log((l - c)/A)|A' 4 2 A 7(p/|log(p)|) 1+1/(/<+1) ) (with appropriate increases in the 
power of K), and for polynomial tails, M,N — 0(\ log((l - c) / K) \ K 4 2 3K /p^ 13 '^) (with conser- 
vative increases in the power of K and the exponent of 2), where 7(/3, K) approaches 1 + 1/(1 +/3) 
for large K. Note that for (5 — 0, this approaches the "classical" resource bound as a function of 
precision. 

The final task of this section is to modify stage I so that the dependence of the resource require- 
ments on b is logarithmic rather than linear in b. The basic idea is to use logarithmic search to 
reduce the uncertainty in (A) to A. Define q by b = qA. 

Stage I'. 

0. Chose A minimal so that G(A) < A/6 and F(A) < 1/18. Set the initial estimate of (A) to 
a = and the initial precision to p a = b = qA. 

1. Repeat the following until p a < A: 

l.a. Sett = l/(p a + A) and obtain x = OE( e -^- a )*, 1/18, 1 - (1 - c)/(2[log 2 (g)])). 

l.b. Update a and p a according to the assignments a <— a — lm(x)/t and p a <— (A/6 + 
(5/18)(p a + A). 

We claim that at the end of this stage, we have determined (A) to within A with overall confidence 
level 1 — (1 — c)/2, so that we can continue with the second stage, as before. To verify the 
claim, it is necessary to confirm that at the end of step l.b., the updated estimate a of (A) has 
precision p a . The error in a can be bounded as we have done for stage II. Let a be the estimate 
of A used in the call to the OEA. There is an error of less than l/(18t) = (p ao + A)/18 due 
to precision of x in the call to the OEA. The remaining error is due to the approximation of 
tr((A — ao)tp) by — Im(tr(e~ i( - j4 " a ^p))- F° r eigenvalues A of A within 1/t of a, this is bounded 
by | At + Im(e~* A *)| < 1/6, which translates into an approximation error of at most l/(6£) = 
(p ao + A)/6. Eigenvalues of A further from a than 1/t = p ao + A are at least A from (A). This 
requires the inductive assumption that the |ao — (A)\ < p ao . The contribution to the mean from 
such eigenvalues is bounded by A/6, and the bias resulting from their contribution to x is at most 
F(A)/t = (p ao + A)/18. Adding up the errors gives the p a computed in step l.b. The confidence 
levels in the calls to the OEA are chosen so that the final confidence level is 1 — (1 — c)/2. To 
see this requires verifying that the number of calls of the OEA is at most [log 2 ((/)]. It suffices 
to show that if p ao > 2A, then A/6 + (5/18)(A + p ao ) < p ao /2. Rewrite the left hand side as 
(8/18) A + (5/18)p ao , which for p ao > 2A is less than (4/18)p ao + (5/18)p ao = p a j2. 

Each call to the OEA in stage II' has constant precision, which implies that M and N are both 
0(log(g)) = 0(log(6/A)) for large q. The total time T is 0(1/ A). 
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FIG. 7: Parallelization of the PEA algorithm to estimate the bit k = 3 of the phase. This replaces the 
outlined parts of the network in Fig.|5] E is an entangler such that E\Q)f = (|0 • • • 0) a + 1 1 • • • l) a )/v / 2, 
and £|100 • • • 0) a = (|0 • • • 0) a - |1 • • • l) a )/V2. E~ l is the decoding operation that maps E^\0 ■ ■ ■ 0) a = 
|+0- • -0) a , and E" 1 ^ ■ ■ ■ l) a = |— - - - 0> a , where |±) = (|0) ± \l))/\f2. The fc'th bit is estimated from 
the measurement outcome of the first ancilla qubit in the logical basis. 



V. PARALLELIZABILITY 



To what extent are the algorithms given in the previous sections parallelizable? Consider the 
OEA. At its core is the PEA with a unitary operator S that has two eigenvalues e ±J * on the rele- 
vant state space. In the sequential implementation, one of the eigenvalues is eventually obtained 
with the desired precision. Which eigenvalue is returned cannot be predicted beforehand. The 
initial state is such that each one has equal probability. If it is possible to deterministically (or 
near-deterministically) prepare an eigenstate |^) with (say) eigenvalue e*^ using sufficiently few 
resources, then we can use the entanglement trick in 111 ill to parallelize the algorithm. Instead 
of applying V sequentially 2 k ~ 1 many times to determine bit k of the phase, we prepare the en- 
tangled state (|0 . . . 0) a + |1 . . . l) )/\/2 on 2 k ~ 1 ancilla qubits and 2 k ~ 1 copies of |^). We next 
apply °S between the j'th ancilla and the j'th copy of and then make a measurement of 
(|0 . . . 0) ± |1 . . . 1) )/y/2). On a quantum computer, the measurement requires decoding the su- 
perposition into a qubit, which can be done with Q(2 k ) gates. The decoding procedure can be 
parallelized to reduce the time to 0{k) (see Note I28IO . Using this trick reduces the time of the 
PEA to 0(log(l/p)) (the number of bits to be determined), counting only the sequential uses of 
U and ignoring the complexity of preparing the initial states |^) and the decoding overhead in 
the measurement. The repetitions required for achieving the desired confidence level are trivially 
parallelizable and do not contribute to the time. It is possible to reduce the time from 0(log(l /p)) 
to 0(1) by avoiding the feed-forward phase correction used in the algorithm and reverting to the 
algorithm in iflill and mentioned in fl]. 

Based on the discussion in the previous paragraph, the main obstacle to parallelizing the OEA 
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is the preparation of |^). If = 2 arccos(|tr(5'p)|) is not close to 0, |^) can be prepared near 
deterministically with relatively few resources as follows. Suppose we have a lower bound e on 
0. With the original initial state, use sequential phase estimation with precision e/2 and confi- 
dence level 1 — (1 — c)p/B to determine whether we have projected onto the eigenstate |^) with 
eigenvalue e 1 ^ or the one with e~ 1 ^. The occurrence of p in the confidence level accounts for the 
total number of states that need to be prepared. The parameter B is a constant that provides an 
additional adjustment to the confidence level. It must be chosen sufficiently large, and other con- 
fidence level parameters must be adjusted accordingly, to achieve the desired overall confidence 
level. If we have projected onto |^), return the state. If not, either try again, or adapt the par- 
allel PEA to use the inverse operator instead of S for this instance of the initial state. The 
(sequential) resources required are of the order of | log((l — c)p)\/e, but all the needed states can 
be prepared in parallel. For e constant, the time required by the parallel PEA is increased by a 
factor of 0(| log((l — c)p)\). The parallel overlap estimation for a unitary operator U based on 
these variations of phase estimation thus requires 0(| log((l — c)p)\) time, provided K^IC/^)! is 
not too close to 1. 

For | (ijj\U\ijj) | close to 1, the OEA is intrinsically not parallelizable without increasing the total 
resource cost by a factor of up to 0( v /p). This is due to the results in l29ll . where it is shown that 
Grover's algorithm cannot be parallelized without reducing the performance to that of classical 
search. For example, consider the problem of determining which unique state \k) of the states 
|0), . . . \2 n ) has its sign flipped by a "black-box" unitary operator V. This can be done with n 
many uses of the OEA by preparing the states \ipb) that are uniform superpositions of the \i) for 
which the number i has 1 as its 6'th bit. If (tp b \U\tp b ) = l/2 n ~ 1 , then the 6'th bit of k is 1. If 
(ipb\U\^pb) = 0, then it is 0. It suffices to use an unparametrized (Fig. |4|) precision of l/2 n ~ 1 
and confidence level sufficiently much bigger than 1 — 1/n. Because |1 — cos(0)| = O(0 2 ), 
the parameterized precision required is 0(l/2 n / 2 ). (O(x) is a quantity that is both O(x) and 
fl(x).) Thus 0(n2 n / 2 ) sequential resources suffice, which is close to the optimum attained by 
Grover's algorithm. However, the results of I29I1 imply that implementing quantum search with 
depth (sequential time) d requires VL(2 n /d) uses of V for d < 2 n / 2 . This implies that to achieve 
a parameterized precision of 6(l/2 n / 2 ) for 1 - \{i)\U\ip)\ = 0(1 /2 n ) using time 0(2 n / 2 /P) 
requires VL{2 n / 2 P) resources. 

The EEA was described so that overlap estimation is used with small cf), and therefore can not 
be immediately parallelized without loss of precision or larger resource requirements. However, 
for the version of overlap estimation needed for stages I' and IF, it is only the imaginary part of 
the overlap that is needed, and the parameters are chosen so that the overlap's phase is expected 
to be within 1 of (because 6> max < 1). The actual precision required is absolute in the overlap, 
not the parameterization of the overlap in terms of the upper hemisphere in Fig. |4j This implies 
that we can call the parallel overlap algorithm with an intentionally suppressed overlap. If the 
desired overlap is (ip\U\ip), one way to suppress it is to replace by U^ aS > and the initial state 
by (1^/2) The suppression ensures that the phases in the calls to the PEA are sufficiently 
distinguishable to allow the near deterministic preparation of the appropriate eigenstates discussed 
above. This adds at most a constant overhead to the EEA due to the additional precision required 
to account for the scaling associated with the overlap suppression. 
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